Siming Chen, Peking University, simingchen3@gmail.com, csm@pku.edu.cn PRIMARY (Point of contact for questions/answers)
Fabian Merkle, Universität Stuttgart, merklefn@studi.informatik.uni-stuttgart.de
Hanna Schäfer, Universität Stuttgart, schaefha@studi.informatik.uni-stuttgart.de
Hongwei Ai, Peking University, hongwei.ai@pku.edu.cn
Cong Guo, Peking University, cong.guo@pku.edu.cn
Xiaoru Yuan, Peking University, xiaoru.yuan@pku.edu.cn (Supervisor)
Thomas Ertl, Universität Stuttgart, Thomas.Ertl@vis.uni-stuttgart.de (Supervisor)
Student Team: YES
AnNetTe 安-内-特, developed by the Peking University's and University of Stuttgart's VAST collaboration team, 2013
May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2013 is complete?
Yes
Video:
http://vis.pku.edu.cn/vastvideo2013.wmv
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Questions
MC3.1 – Provide a timeline (i.e., events organized in chronological order) of the notable events that occur in Big Marketing’s computer networks for the two weeks of supplied data. Use all data at your disposal to identify up to twelve events and describe them to the extent possible. Your answer should be no more than 1000 words long and may contain up to twelve images.
Our events are presented chronologically (the timeline and the tool are described in question 3), except for those occurring repeatedly, which are named at their first occurrence.
1.
Mon 01, 4.30pm: After the regular working-hours connections, the external IP traffic diminishes and disappears around 12am. The main remaining activity is the continuous scan by the health monitor 172.10.0.6. Considering the pattern of the following nights, this could be a network breakdown caused by an earlier event or by the unhealthy status.
2.
Tue 02, 5am: After the night, the first attack starts. IPs 10.6.6.6, 10.6.6.13, 10.6.6.14, 10.7.6.3, 10.7.7.10, 10.10.6.2, 10.11.6.15, 10.16.5.15, 10.18.6.123 and 10.100.1.6 use many ports between 10000-70000 to access destination port 80 of the main HTTP server 172.30.0.4. We classify it as a DDOS because of the high CPU load and the high connection payload. The health status worsens during the attack and recovers after the attacking IPs disappear at 7am.
3.
Wed 03, 9.30am: After the regular remainder of day 2 and day 3, IP 10.15.7.85 starts attacking port 80 of 172.20.0.15 using ports 200-70000. Some other IPs participate, but never to that extent. We classify it as a scan because of the high CPU load and the high payload of the connections. The health status worsens during the attack and becomes healthy again after the attacking IPs disappear at 7am.
4.
Sat 06, 11.30am: After the regular period from day 4 into day 6, IPs 10.9.81.5 and 10.10.11.15 start attacking many destination ports on server IPs in every company part, using ports from 50000-70000. IP 10.10.11.15 only attacks 172.20.0 IPs and stops after 11.45am. The lower payload, combined with the high number of destination ports and IPs, indicates a network scan. The event ends on day 7 at 3.20am and turns into the regular night pattern. However, after 7am the system becomes very unhealthy, and the administrators shut down the network after 9am for two days to install preventive measures.
5.
Week 2: The system restarts on day 10 at 6.30am and at first only allows internal IPs. When the network opens, it becomes unhealthier again. Using the IPS log, we identified three IPs, 10.13.77.49, 10.138.235.111 and 10.6.6.7, that attack at several times of day. Each event uses only some source ports from 30000-5000 and many destination ports, a lot of which are blocked by a connection denial logged in the IPS data. The attack targets server IPs in all company parts. The lower payload, combined with the high number of destination ports and the IPS deny logs, indicates a network scan. The scans are most active on day 10 from 12.30pm to 5.15pm, on day 14 from 11.10am to 5.20pm and on day 15 from 7.45am to 9.59am.
6.
Thu 11, 10.30am: After event 5 on day 10, the night to day 11 is regular. On day 11 at 10.30am, IPs 10.12.15.152 and 10.6.6.7 attack the server IPs of all company parts, using some ports from 30000-70000 and a lot of destination ports. IP 10.6.6.7 is active from 12pm to 9.15pm and is only denied connections in the second company part. Since CPU load, IPS activity and bytes are all high, this attack probably consists of a DOS part and a scan part. The health status stays constant but improves after the attack ends on day 12 at 6.20am.
7.
Thu 11, 12.15pm-1.00pm: During event 6 there is a hidden event. It consists of a DOS from many malicious IPs that turns into a reverse port scan, in which server 172.30.0.4 connects to many external IPs using port 80 and many destination ports. The CPU load and bytes are very high, as is the destination port entropy.
8.
Fri 12, 10.30am: After event 6, IPs 10.12.15.152 and 10.12.14.15 attack the servers of company parts one and two, using many ports between 10000-70000 and destination ports 80 and 3389 (the latter is used for remote desktop services). The event is a scan with many deny logs in the IPS data, high CPU load, high connection payload and low destination port numbers. The health status stays constant and only improves some hours after the attack ends on day 12 at 3.450pm.
9.
Sat 13, 6am: Continuing events 6 and 8, IP 10.12.15.152 attacks together with IP 10.17.15.10, using some source ports and destination ports 0, 25, 80 and 3389 on the servers of company parts one and two. The event is a scan with many deny logs in the IPS data, high CPU load, high connection payload and low destination port numbers. The health status stays constant and improves after the attack ends on day 13 at 10.45pm.
10.
Sat 13, 11.20pm-1.40am: A reaction to event 9 shows in two ways. First, all the internal IPs connect to the broadcast IP 239.255.255.255. Second, the IPs at 172.10.1 have a high connection rate to the 10.0.0 IPs. Both could result from a virus delivered during event 9.
11.
Sun 14, 2pm-3.10pm: During event 5 on day 14 there is a short period in which many IPs (10.15.7.85, 10.12.15.152, 10.17.15.10, 10.12.14.15, 10.200.20.2, 10.156.165.120, 10.70.68.127, 10.250.178.101, 10.170.68.127, 10.179.32.181, 10.179.32.110, 10.78.100.150, 10.247.58.182, 10.247.106.27, 10.10.11.102) connect to 172.10.0.4, 172.20.0.4 and 172.30.0.4. This event is interesting because it combines many IPs from earlier attacks in one event. They use many ports in both directions and also drive the server IPs into a very unhealthy state.
12.
Sun 14, 11.45pm-1.45am: This period is an exception to the regular night. During that time the system has no external connections, which, according to the answer to our question, was not intentional but due to network problems. Even though it is a regular night, the health status since the attack of event 5 on day 14 is the worst of both weeks. This might indicate that the scans of event 5 are strong enough to break down the network.
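The reasoning used in the events above (high payload points to a DOS, many destination ports and IPs with lower payload point to a scan, and both together point to a combined attack as in event 6) can be sketched as a small rule. This is only an illustration of that reasoning, not the logic of our tool: the record fields (dst_ip, dst_port, bytes) and all thresholds are hypothetical and would have to be tuned on the actual NetFlow data.

```python
def classify_interval(flows, bytes_threshold=1_000_000,
                      port_threshold=100, ip_threshold=50):
    """Label one time interval of flow records.

    `flows` is a list of dicts with the hypothetical keys
    "dst_ip", "dst_port" and "bytes". All thresholds are
    illustrative assumptions, not values from the challenge data.
    """
    total_bytes = sum(f["bytes"] for f in flows)
    dst_ports = {f["dst_port"] for f in flows}
    dst_ips = {f["dst_ip"] for f in flows}

    # Many distinct destination ports or IPs suggest scanning.
    many_targets = (len(dst_ports) >= port_threshold
                    or len(dst_ips) >= ip_threshold)
    # A large total payload suggests a denial-of-service component.
    high_payload = total_bytes >= bytes_threshold

    if many_targets and high_payload:
        return "dos+scan"
    if many_targets:
        return "scan"
    if high_payload:
        return "dos"
    return "regular"
```

In practice the CPU load, the IPS deny logs and the health status would be checked as well before naming an event; the sketch only covers the payload and target-count part of the argument.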
MC3.2 – Speculate on one or more narratives that describe the events on the network. Provide a list of analytic hypotheses and/or unanswered questions about the notable events. In other words, if you were to hand off your timeline to an analyst who will conduct further investigation, what confirmations and/or answers would you like to see in their report back to you? Your answer should be no more than 300 words long and may contain up to three additional images.
Hypothesis 2 is about the development of the health status. In the first week we can see that the accumulated health status of the system is mostly stable during the night, but then briefly gets very unhealthy at the start of each day around 7am. At the same time there is a null point in the connection entropy, which might indicate a small breakdown. In the second week those changes around 7am are a health leak instead of a peak, and the null point is less deep. These daily events are probably a result of what happens in hypothesis one, because the small breakdowns happen after the DOS or the scanning. This would also explain why there are fewer breakdowns in the first part of week 2, after those malicious IPs were blocked. Unfortunately, on day 13 the trend becomes unhealthy again, which might be caused by the attackers getting past the preventive measures of the administrators.
Thirdly, we found great differences between the source and destination variables of single connections, which should be investigated but are not resolved yet.
MC3.3 – Describe the role that your visual analytics played in enabling discovery of the notable events in MC3.1. Describe whether your visual analytics play a role in formulating the questions in MC3.2. Your answer should be no more than 300 words long and may contain up to three additional images.
To find the events for question one we used our tool AnNetTe. It consists of an overview timeline (showing IP/port entropy, CPU/byte load, IPS and health data) and, for the details, one ring graph and one river view. All views visualize the data and are connected in an interaction pipeline.
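As an illustration of the entropy overviews, the following minimal sketch computes the Shannon entropy of one field (for example the destination port) per time bin. It is an assumption about how such a curve can be derived, not the code of our tool; the record layout (a dict with a "time" key in seconds plus the field of interest) and the bin width are hypothetical.

```python
import math
from collections import Counter, defaultdict


def shannon_entropy(values):
    """Shannon entropy in bits of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # p * log2(1/p) summed over all distinct values.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def entropy_per_bin(records, field, bin_seconds=60):
    """Group records into fixed time bins and return {bin_index: entropy}."""
    bins = defaultdict(list)
    for rec in records:
        bins[rec["time"] // bin_seconds].append(rec[field])
    return {b: shannon_entropy(vals) for b, vals in sorted(bins.items())}
```

A bin in which all connections target the same port has entropy 0, while a scan over many ports raises the destination port entropy, which is what makes scans visible as peaks in the overview.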
To find any anomaly, we choose a time in the timeline and play the animation of the ring view. This way we can see any anomalous connection.
To find a DOS attack we can use various features. First we look at the overview timeline, where we search for peaks in the CPU load, the total bytes or the entropy of the source ports. If we find such a peak, we select it and then refine our selection in the detailed timeline of the four entropy lines. Next we look at the ring graph, where we can easily see which IP group is causing the peak. If it is not easy to find, we can exclude some groups to reduce the clutter.
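In our tool these steps are done visually and interactively. For readers who want to reproduce the idea without the tool, the two steps (find a peak, then find the IP group behind it) can be approximated as below. This is a rough sketch under our own assumptions: the peak rule (mean plus k standard deviations), the grouping of IPs by their first two octets, and the record keys "time", "src_ip" and "bytes" are all hypothetical choices.

```python
from collections import Counter
from statistics import mean, pstdev


def find_peaks(series, k=2.0):
    """Return indices whose value exceeds mean + k * standard deviation."""
    if not series:
        return []
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        return []  # a flat series has no peaks
    return [i for i, v in enumerate(series) if v > mu + k * sigma]


def top_source_groups(records, start, end, n=3, prefix_octets=2):
    """Sum bytes per source IP group within [start, end) and return the top n.

    An IP group is approximated by the first `prefix_octets` octets
    of the source address (an assumption made for this sketch).
    """
    totals = Counter()
    for rec in records:
        if start <= rec["time"] < end:
            group = ".".join(rec["src_ip"].split(".")[:prefix_octets])
            totals[group] += rec["bytes"]
    return totals.most_common(n)
```

For example, find_peaks could be run on the per-bin byte totals, and the time range of a flagged bin then passed to top_source_groups to see which group dominates the traffic.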